SNP detection for massively parallel whole-genome resequencing.
Abstract
Next-generation massively parallel sequencing technologies provide ultrahigh throughput at two orders of magnitude lower unit cost than capillary Sanger sequencing technology. One of the key applications of next-generation sequencing is studying genetic variation between individuals using whole-genome or target region resequencing. Here, we have developed a consensus-calling and SNP-detection method for sequencing-by-synthesis Illumina Genome Analyzer technology. We designed this method by carefully considering the data quality, alignment, and experimental errors common to this technology. All of this information was integrated into a single quality score for each base under Bayesian theory to measure the accuracy of consensus calling. We tested this methodology using a large-scale human resequencing data set of 36x coverage and assembled a high-quality nonrepetitive consensus sequence for 92.25% of the diploid autosomes and 88.07% of the haploid X chromosome. Comparison of the consensus sequence with Illumina human 1M BeadChip genotyped alleles from the same DNA sample showed that 98.6% of the 37,933 genotyped alleles on the X chromosome and 98% of 999,981 genotyped alleles on autosomes were covered at 99.97% and 99.84% consistency, respectively. At a low sequencing depth, we used prior probability of dbSNP alleles and were able to improve coverage of the dbSNP sites significantly as compared to that obtained using a nonimputation model. Our analyses demonstrate that our method has a very low false call rate at any sequencing depth and excellent genome coverage at a high sequencing depth.
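The Bayesian scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes independent reads, a simple substitution error rate taken from each base's Phred quality, and ten unordered diploid genotypes, and it leaves out the alignment and experimental error terms the abstract mentions. The names `genotype_posteriors` and `call_consensus`, and the optional `priors` argument (standing in for a dbSNP-informed prior at low depth), are hypothetical.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"
# The ten unordered diploid genotypes: AA, AC, AG, AT, CC, ...
GENOTYPES = ["".join(g) for g in combinations_with_replacement(BASES, 2)]


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def genotype_posteriors(observations, priors=None):
    """Posterior probability of each genotype at one site.

    observations: list of (base, phred_quality) pairs from aligned reads.
    priors: optional dict genotype -> prior probability (all values > 0).
            Assumption: a uniform prior is used when none is given.
    """
    if priors is None:
        priors = {g: 1 / len(GENOTYPES) for g in GENOTYPES}
    log_post = {}
    for g in GENOTYPES:
        total = math.log(priors[g])
        for base, q in observations:
            e = phred_to_error(q)
            # Each allele is sampled with probability 1/2; a read shows the
            # true allele with probability 1 - e, any other base with e / 3.
            p = sum(0.5 * ((1 - e) if base == allele else e / 3) for allele in g)
            total += math.log(p)
        log_post[g] = total
    # Normalize in log space to avoid underflow at high depth.
    peak = max(log_post.values())
    norm = sum(math.exp(v - peak) for v in log_post.values())
    return {g: math.exp(v - peak) / norm for g, v in log_post.items()}


def call_consensus(observations, priors=None, max_quality=99):
    """Return (best genotype, Phred-scaled quality of the call)."""
    post = genotype_posteriors(observations, priors)
    best = max(post, key=post.get)
    # Probability that the call is wrong: mass on all other genotypes.
    p_wrong = sum(p for g, p in post.items() if g != best)
    if p_wrong <= 0:
        return best, max_quality
    return best, min(max_quality, round(-10 * math.log10(p_wrong)))
```

With a uniform prior, a single read supporting G favors the homozygote GG. Passing a prior that puts most of its weight on a known heterozygous genotype can shift the call, which is the kind of effect a dbSNP-based prior would have at low depth.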
Related articles
SNP discovery in apple cultivars using next generation sequencing
Background Knowledge about single nucleotide polymorphism (SNP) markers is extremely important in the development of genotyping assays, allowing improvements in plant breeding through marker-assisted selection. With the emergence of next generation sequencing platforms, high-density SNP discovery in the genome of plant crops becomes more achievable. In this project, we carried out whole genome ...
Does massively parallel DNA resequencing signify the end of histopathology as we know it?
Next-generation DNA sequencing devices have revolutionized cancer genomics by bringing whole genome resequencing of patients' tumours within practical and economic reach. We present an overview of the techniques involved and review early results from the resequencing of cancer genomes. The possible impacts of whole-genome and transcriptome resequencing in clinical cancer research and the practic...
Target Amplicon Sequencing for Genotyping Genome-Wide Single Nucleotide Polymorphisms Identified by Whole-Genome Resequencing in Peanut.
Genome-wide genotyping data regarding breeding materials are essential resources for improving breeding efficiency, especially in plants with complex genomes with a high degree of polyploidy. Several current breeding efforts in cultivated peanut (Arachis hypogaea L.), which has a tetraploid genome, are devoted to developing high oleic acid cultivars. Genetic maps for such breeding programs have been developed ...
Genome-Wide SNP Calling Using Next Generation Sequencing Data in Tomato
The tomato (Solanum lycopersicum L.) is a model plant for genome research in Solanaceae, as well as for studying crop breeding. Genome-wide single nucleotide polymorphisms (SNPs) are a valuable resource in genetic research and breeding. However, to discover genome-wide SNPs, most methods require expensive high-depth sequencing. Here, we describe a method for SNP calling using a modified ...
NpgRJ_Nmeth_1179 183..188
Massively parallel sequencing instruments enable rapid and inexpensive DNA sequence data production. Because these instruments are new, their data require characterization with respect to accuracy and utility. To address this, we sequenced a Caenorhabditis elegans N2 Bristol strain isolate using the Solexa Sequence Analyzer, and compared the reads to the reference genome to characterize the dat...
Journal:
- Genome Research
Volume 19, Issue 6
Pages -
Publication date: 2009